AIbase

# Image-Text Fusion

Meta Llama Llama 4 Maverick 17B 128E Instruct
Other
Llama 4 Maverick is a multimodal AI model released by Meta, supporting text and image understanding. It adopts a Mixture of Experts (MoE) architecture and excels in multilingual text and code generation tasks.
Multimodal Fusion, Transformers, Supports Multiple Languages
Undi95
35
2
Liquid V1 7B
MIT
Liquid is an autoregressive generation paradigm that achieves seamless fusion of visual understanding and generation by tokenizing images into discrete codes and learning these code embeddings alongside text tokens in a shared feature space.
Text-to-Image, Transformers, English
Junfeng5
11.35k
84
Pixtral Large Instruct 2411
Other
Pixtral-Large-Instruct-2411 is a multimodal, instruction-tuned model based on Mistral AI's Pixtral Large. It accepts image and text input and supports multiple languages.
Image-to-Text, Transformers, Supports Multiple Languages
nintwentydo
23
2
© 2025 AIbase